An Efficient Natural Language Processing System Specially Designed for the Chinese Language

Authors

  • Lin-Shan Lee
  • Lee-Feng Chien
  • Long Ji Lin
  • James Huang
  • Keh-Jiann Chen
Abstract

This paper presents an efficient natural language processing system specially designed for the Chinese language. The center of the system is a bottom-up chart parser with head-driven operation; i.e., phrases are built up by starting with their heads and adjoining constituents to the left or right of the heads instead of strictly from left to right. In this way many unnecessary searching actions can be effectively eliminated. The system also includes several efficient approaches: a direction-selective chart to simplify the control of the head-driven operation; a heuristic scheduling policy and a bidirectional look-ahead approach to eliminate many unnecessary searching actions; and an improved raise-bind mechanism combined with check rules to treat the difficult problems of movement transformations and empty categories and to simplify the design of grammar rules. The design is based on careful consideration of some special syntactic phenomena of the Chinese language, such as head-final and head-initial structures and empty categories. A prototype of the system has been successfully implemented and extensive experiments have been performed. The test results show a significant improvement in efficiency when processing many very complicated Chinese sentences. The various approaches, the overall system design, and the experimental results are discussed in detail in this paper.
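
The head-driven, bottom-up idea described in the abstract can be illustrated with a minimal recognizer sketch. This is not the authors' implementation: the names `Rule`, `extensions`, and `recognize`, the toy grammar, and the five-word lexicon are all assumptions made for illustration, and the sketch omits the direction-selective chart, the scheduling policy, the look-ahead, and the raise-bind mechanism. It only shows how a phrase can be seeded by its head and then grown by adjoining constituents to the left (head-final NP) or to the right (head-initial VP).

```python
from collections import namedtuple

# Hypothetical rule format: `head` is the index of the head daughter in `rhs`.
Rule = namedtuple("Rule", ["lhs", "rhs", "head"])

def extensions(rules, edge, constituent):
    """Yield edges made by adjoining `constituent` to the left or right of `edge`."""
    r, lo, hi, start, end = edge      # rhs[lo:hi] already found, spanning start..end
    cat, i, j = constituent           # complete category `cat` spanning i..j
    rhs = rules[r].rhs
    if lo > 0 and rhs[lo - 1] == cat and j == start:    # adjoin on the left
        yield (r, lo - 1, hi, i, end)
    if hi < len(rhs) and rhs[hi] == cat and i == end:   # adjoin on the right
        yield (r, lo, hi + 1, start, j)

def recognize(words, rules, lexicon, goal="S"):
    """Return True if `words` can be analysed as `goal` (recognition only)."""
    complete = set()   # complete constituents: (cat, start, end)
    active = set()     # partial edges: (rule_index, lo, hi, start, end)
    agenda = [(cat, i, i + 1)
              for i, w in enumerate(words) for cat in lexicon.get(w, ())]
    while agenda:
        item = agenda.pop()
        if len(item) == 3:                       # a complete constituent
            if item in complete:
                continue
            complete.add(item)
            cat, i, j = item
            # Seed every rule whose head daughter has this category.
            for r, rule in enumerate(rules):
                if rule.rhs[rule.head] == cat:
                    agenda.append((r, rule.head, rule.head + 1, i, j))
            # Let existing partial edges grow with the new constituent.
            for edge in active:
                agenda.extend(extensions(rules, edge, item))
        else:                                    # a partial edge
            if item in active:
                continue
            active.add(item)
            r, lo, hi, start, end = item
            rule = rules[r]
            if lo == 0 and hi == len(rule.rhs):  # all daughters found
                agenda.append((rule.lhs, start, end))
            else:
                for constituent in complete:
                    agenda.extend(extensions(rules, item, constituent))
    return (goal, 0, len(words)) in complete

# Toy grammar (an assumption, not taken from the paper).
RULES = [
    Rule("S", ("NP", "VP"), 1),
    Rule("VP", ("V", "NP"), 0),       # head-initial: the verb comes first
    Rule("NP", ("A", "DE", "N"), 2),  # head-final: the noun comes last
    Rule("NP", ("N",), 0),
]
LEXICON = {"我": ("NP",), "喜欢": ("V",), "红": ("A",), "的": ("DE",), "书": ("N",)}
SENTENCE = ["我", "喜欢", "红", "的", "书"]   # "I like red books"

print(recognize(SENTENCE, RULES, LEXICON))
```

In this sketch the noun 书 seeds the NP rule and the edge grows leftward over 的 and 红, while the verb 喜欢 seeds the VP rule and the edge grows rightward over the finished NP. No edge is ever started from a non-head daughter, which is the source of the reduced search that the abstract attributes to head-driven operation.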


Related resources

A Chinese Natural Language Processing System Based Upon the Theory of Empty Categories

In this paper, we will present a device specially designed on the basis of the theory of empty categories. This device cooperates with a bottom-up parser and is used as an elegant and efficient approach to treat the troublesome problems of the transformations of passivization, relativization, topicalization, ba-transformation and the use of zero pronouns in Chinese natural language. With the aid of ...


Efficient Reverse Converter for Three Modules Set {2^n-1,2^(n+1)-1,2^n} in Multi-Part RNS

The Residue Number System is a numeral system in which arithmetic operations are performed in parallel. One of the main factors affecting the system's performance is the complexity of the reverse converter. The complexity of this part should not offset the speed gained by the parallel arithmetic unit. Therefore, in this paper a high-speed converter for the moduli set {2^n-1, ...



A Neural Network-Based System for Recognizing and Classifying Named Entities in Persian Texts

Named Entity Recognition (NER) is a fundamental task in natural language processing and a subtask of information extraction. It seeks to locate and classify named entities in text into predefined categories such as the names of persons, organizations, locations, expressions of time, etc. Named Entity Recognition for English texts has been widely researched in past years, howev...


Language Selection at the Time of Processing Anger: A Case Study of Turkish-Persian Bilinguals

Recent research reports the influence of bilingualism on many cognitive and emotional processes. The aim of the present study is to investigate the role of bilingualism in processing anger in the first (L1) and second (L2) languages of Turkish-Persian bilinguals. To achieve this goal, 18 Turkish-Persian sequential bilinguals (with an average age of 26) who were students of Tehran universities were sel...



Journal title:
  • Computational Linguistics

Volume 17


Publication date: 1991